A Comprehensive Isolated Farsi/Arabic Character Database for Handwritten OCR Research
Internal identifier: 000F55 (Main/Exploration); previous: 000F54; next: 000F56
Authors: Saeed Mozaffari [Iran]; Karim Faez [Iran]; Farhad Faradji [Iran]; Majid Ziaratban [Iran]; S. Mohamad Golzan [Iran]
English descriptors: Comparative database; Farsi/Arabic; OCR; isolated numbers and characters; offline
Abstract
This paper presents a new comprehensive database of isolated offline handwritten Farsi/Arabic numbers and characters for use in optical character recognition research. The database is freely available for academic use. So far, no such freely available database exists for the Farsi language. It includes grayscale images of 52,380 characters and 17,740 numerals. Each image was scanned at 300 dpi from Iranian school entrance exam forms during the years 2004-2006. The only restriction imposed on the writers was to write each character within a rectangular box. The number of samples in each class is non-uniform, reflecting real-life distributions. Also, for comparison purposes, each dataset has been divided into training and test sets.
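Url: https://hal.inria.fr/inria-00112676
Affiliations: Pattern Recognition and Image Processing Laboratory, Amirkabir University of Technology, Tehran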
Links to previous steps (curation, corpus...)
- to stream Hal, to step Corpus: 000004
- to stream Hal, to step Curation: 000004
- to stream Hal, to step Checkpoint: 000129
- to stream Main, to step Merge: 000F70
- to stream Main, to step Curation: 000F55
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">A Comprehensive Isolated Farsi/Arabic Character Database for Handwritten OCR Research</title>
<author><name sortKey="Mozaffari, Saeed" sort="Mozaffari, Saeed" uniqKey="Mozaffari S" first="Saeed" last="Mozaffari">Saeed Mozaffari</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING"><orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc><address><country key="IR"></country>
</address>
</desc>
<listRelation><relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-307924" type="direct"><org type="institution" xml:id="struct-307924" status="INCOMING"><orgName>Amirkabir University of Technology, Tehran</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
<author><name sortKey="Faez, Karim" sort="Faez, Karim" uniqKey="Faez K" first="Karim" last="Faez">Karim Faez</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING"><orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc><address><country key="IR"></country>
</address>
</desc>
<listRelation><relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-307924" type="direct"><org type="institution" xml:id="struct-307924" status="INCOMING"><orgName>Amirkabir University of Technology, Tehran</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
<author><name sortKey="Faradji, Farhad" sort="Faradji, Farhad" uniqKey="Faradji F" first="Farhad" last="Faradji">Farhad Faradji</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING"><orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc><address><country key="IR"></country>
</address>
</desc>
<listRelation><relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-307924" type="direct"><org type="institution" xml:id="struct-307924" status="INCOMING"><orgName>Amirkabir University of Technology, Tehran</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
<author><name sortKey="Ziaratban, Majid" sort="Ziaratban, Majid" uniqKey="Ziaratban M" first="Majid" last="Ziaratban">Majid Ziaratban</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING"><orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc><address><country key="IR"></country>
</address>
</desc>
<listRelation><relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-307924" type="direct"><org type="institution" xml:id="struct-307924" status="INCOMING"><orgName>Amirkabir University of Technology, Tehran</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
<author><name sortKey="Golzan, S Mohamad" sort="Golzan, S Mohamad" uniqKey="Golzan S" first="S. Mohamad" last="Golzan">S. Mohamad Golzan</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING"><orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc><address><country key="IR"></country>
</address>
</desc>
<listRelation><relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-307924" type="direct"><org type="institution" xml:id="struct-307924" status="INCOMING"><orgName>Amirkabir University of Technology, Tehran</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:inria-00112676</idno>
<idno type="halId">inria-00112676</idno>
<idno type="halUri">https://hal.inria.fr/inria-00112676</idno>
<idno type="url">https://hal.inria.fr/inria-00112676</idno>
<date when="2006-10-23">2006-10-23</date>
<idno type="wicri:Area/Hal/Corpus">000004</idno>
<idno type="wicri:Area/Hal/Curation">000004</idno>
<idno type="wicri:Area/Hal/Checkpoint">000129</idno>
<idno type="wicri:Area/Main/Merge">000F70</idno>
<idno type="wicri:Area/Main/Curation">000F55</idno>
<idno type="wicri:Area/Main/Exploration">000F55</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">A Comprehensive Isolated Farsi/Arabic Character Database for Handwritten OCR Research</title>
<author><name sortKey="Mozaffari, Saeed" sort="Mozaffari, Saeed" uniqKey="Mozaffari S" first="Saeed" last="Mozaffari">Saeed Mozaffari</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING"><orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc><address><country key="IR"></country>
</address>
</desc>
<listRelation><relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-307924" type="direct"><org type="institution" xml:id="struct-307924" status="INCOMING"><orgName>Amirkabir University of Technology, Tehran</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
<author><name sortKey="Faez, Karim" sort="Faez, Karim" uniqKey="Faez K" first="Karim" last="Faez">Karim Faez</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING"><orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc><address><country key="IR"></country>
</address>
</desc>
<listRelation><relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-307924" type="direct"><org type="institution" xml:id="struct-307924" status="INCOMING"><orgName>Amirkabir University of Technology, Tehran</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
<author><name sortKey="Faradji, Farhad" sort="Faradji, Farhad" uniqKey="Faradji F" first="Farhad" last="Faradji">Farhad Faradji</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING"><orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc><address><country key="IR"></country>
</address>
</desc>
<listRelation><relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-307924" type="direct"><org type="institution" xml:id="struct-307924" status="INCOMING"><orgName>Amirkabir University of Technology, Tehran</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
<author><name sortKey="Ziaratban, Majid" sort="Ziaratban, Majid" uniqKey="Ziaratban M" first="Majid" last="Ziaratban">Majid Ziaratban</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING"><orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc><address><country key="IR"></country>
</address>
</desc>
<listRelation><relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-307924" type="direct"><org type="institution" xml:id="struct-307924" status="INCOMING"><orgName>Amirkabir University of Technology, Tehran</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
<author><name sortKey="Golzan, S Mohamad" sort="Golzan, S Mohamad" uniqKey="Golzan S" first="S. Mohamad" last="Golzan">S. Mohamad Golzan</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING"><orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc><address><country key="IR"></country>
</address>
</desc>
<listRelation><relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-307924" type="direct"><org type="institution" xml:id="struct-307924" status="INCOMING"><orgName>Amirkabir University of Technology, Tehran</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="mix" xml:lang="en"><term>Comparative database</term>
<term>Farsi/Arabic</term>
<term>OCR</term>
<term>isolated numbers and characters</term>
<term>offline</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">This paper presents a new comprehensive database of isolated offline handwritten Farsi/Arabic numbers and characters for use in optical character recognition research. The database is freely available for academic use. So far, no such freely available database exists for the Farsi language. It includes grayscale images of 52,380 characters and 17,740 numerals. Each image was scanned at 300 dpi from Iranian school entrance exam forms during the years 2004-2006. The only restriction imposed on the writers was to write each character within a rectangular box. The number of samples in each class is non-uniform, reflecting real-life distributions. Also, for comparison purposes, each dataset has been divided into training and test sets.</div>
</front>
</TEI>
<affiliations><list><country><li>Iran</li>
</country>
</list>
<tree><country name="Iran"><noRegion><name sortKey="Mozaffari, Saeed" sort="Mozaffari, Saeed" uniqKey="Mozaffari S" first="Saeed" last="Mozaffari">Saeed Mozaffari</name>
</noRegion>
<name sortKey="Faez, Karim" sort="Faez, Karim" uniqKey="Faez K" first="Karim" last="Faez">Karim Faez</name>
<name sortKey="Faradji, Farhad" sort="Faradji, Farhad" uniqKey="Faradji F" first="Farhad" last="Faradji">Farhad Faradji</name>
<name sortKey="Golzan, S Mohamad" sort="Golzan, S Mohamad" uniqKey="Golzan S" first="S. Mohamad" last="Golzan">S. Mohamad Golzan</name>
<name sortKey="Ziaratban, Majid" sort="Ziaratban, Majid" uniqKey="Ziaratban M" first="Majid" last="Ziaratban">Majid Ziaratban</name>
</country>
</tree>
</affiliations>
</record>
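The record above can also be read outside Dilib with a general-purpose XML parser. The following is a minimal sketch in Python, not part of the Wicri tooling. It assumes the record is available as a string. Because the record uses the `wicri:` and `hal:` prefixes without declaring them, a strict parser rejects it as written, so the sketch binds both prefixes to placeholder URIs on a wrapper element. The URIs, the wrapper element, and the names `parse_record` and `summarize` are hypothetical, and `RECORD` is a trimmed copy of the record (one author, two keywords) used only for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record above (one author, two keywords), for illustration.
RECORD = """<record><TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en">A Comprehensive Isolated Farsi/Arabic Character Database for Handwritten OCR Research</title>
<author><name sortKey="Mozaffari, Saeed" first="Saeed" last="Mozaffari">Saeed Mozaffari</name>
<affiliation wicri:level="1"><country>Iran</country></affiliation></author>
</titleStmt></fileDesc>
<profileDesc><textClass><keywords scheme="mix" xml:lang="en">
<term>Farsi/Arabic</term><term>OCR</term>
</keywords></textClass></profileDesc></teiHeader></TEI></record>"""

# Placeholder namespace URIs (assumptions, not official identifiers): the
# record uses the wicri: and hal: prefixes without declaring them.
NS_DECL = 'xmlns:wicri="urn:example:wicri" xmlns:hal="urn:example:hal"'


def parse_record(text):
    """Parse a record after binding the undeclared prefixes on a wrapper element."""
    wrapped = "<wrapper {}>{}</wrapper>".format(NS_DECL, text)
    return ET.fromstring(wrapped).find("record")


def summarize(record):
    """Collect the title, author names and keywords of one record."""
    # Restricting the search to titleStmt avoids counting the authors twice,
    # since the full record repeats them under sourceDesc/biblStruct/analytic.
    title = record.findtext(".//titleStmt/title")
    authors = [name.text for name in record.findall(".//titleStmt/author/name")]
    keywords = [term.text for term in record.findall(".//keywords/term")]
    return {"title": title, "authors": authors, "keywords": keywords}


summary = summarize(parse_record(RECORD))
print(summary)
```

Wrapping the text rather than editing it leaves the record unchanged. The `xml:` prefix used by `xml:lang` is predefined and needs no declaration.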
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000F55 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000F55 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= Hal:inria-00112676 |texte= A Comprehensive Isolated Farsi/Arabic Character Database for Handwritten OCR Research }}
This area was generated with Dilib version V0.6.32.